We report on a long-term study which was conducted in a German secondary school 
with 128 eighth graders (ages 14 to 15) in four different classes. Two of these classes 
served as control groups. The mathematics lessons of the other two classes (treatment 
groups) were frequently enriched by distinct phases in which structured 
argumentation and the use of heuristics were trained. The study aimed at investigating 
the development of the argumentation competence of the students over that period. For 
this report, the products of four different geometry tasks of 15 students from one of the 
treatment groups and 15 from one of the control groups were evaluated. 


INTRODUCTION 


Both “reasoning and proof” and “problem solving” are important parts of mathematics 
curricula all around the world (e.g., NCTM 2000). Though both deal with aspects of 
producing mathematical argumentation, mathematics educators tend to 
compartmentalize those two domains (Mamona-Downs & Downs 2013). Problem 
solving is perceived as focusing on progressing work, whereas the proof 
tradition highlights evaluating the soundness of the product of reasoning (cf. ibid.). 


We report on a 1.5-year study covering two experimental and two control classes 
emphasizing reasoning and proof as well as problem solving. In this paper, we confine 
ourselves to the “reasoning and proof” part of this study with a focus on the 
methodology of rating the students’ products. Additionally, we present initial results 
by highlighting quantitative (scores) as well as qualitative (ways of reasoning) 
analyses of the students’ products at the beginning and at the end of the study. 


THEORETICAL BACKGROUND 


Reasoning and proof is a significant aspect of mathematics and therefore also 
important for mathematics at school. It is, however, very difficult for students of all 
grades up to university level to generate or even read proofs on their own. Reid and 
Knipping (2010, p. 68 ff.) summarize several studies regarding the construction of 
proofs, which all agree on the fact that most students cannot write a correct proof. 


There is a need for good teaching concepts regarding reasoning and proof as well as for 
studies that accompany related teaching experiments. Important components of such studies 
are methods to measure the argumentational competencies of the participating 
students. These methods need to be able to account for the (partially) complex 
structures of proofs, to appropriately compare different approaches and levels of 
elaboration of proofs, and consequently to show progress in the generation of proofs. 


2014. In Nicol, C., Liljedahl, P., Oesterle, S., & Allan, D. (Eds.) Proceedings of the Joint Meeting 
of PME 38 and PME-NA 36, Vol. 2, pp. 193-200. Vancouver, Canada: PME. 


Articles published in the Proceedings are copyrighted by the authors. 


Brockmann-Behnsen, Rott 


Many researchers studying reasoning and proof use the Toulmin (1958) model, which 
was developed to reconstruct arguments in different fields (cf. Knipping 2008). 
According to Toulmin, the basic structure of rational arguments can be described as 
consisting of the pair of datum and conclusion. As this step might be challenged, a 
warrant can be added to justify it. Toulmin adds additional elements to his model (like 
qualifiers that can restrict the conclusion or backing for warrants) as do other 
researchers that use it. For example, Ubuz et al. (2012) add elements to describe 
statements and actions of teachers in classroom situations (like guide-redirecting) and 
specifications of existing elements (like deductive warrant and reference warrant). 


However, the Toulmin model has its limitations. For example, it “is not adequate for 
more complex argumentation structures [in classrooms ]” (Ubuz et al. 2012, p. 168) and 
it “de-emphasises the times” (Knipping 2008, p. 439) and thus is not able to outline the 
development of argumentations. Most notably, the Toulmin model is not designed to 
analyze written argumentations such as students’ solutions of proof tasks. Analyses of 
students’ solutions with this model would mostly consist of data and conclusions, 
missing the rebuttals of dialogue partners and the corresponding backings. 


As an alternative method to reconstruct argumentation steps and streams in written 
work of students, we propose in this article an adapted version of the multigraph 
representation by König (1992). He uses different graphical elements to denote 
elements like “starting quantities”, “solution state” and “intermediate states” as well as 
logical derivations between states and heuristic elements that might help in proceeding 
from one state to another (see the Methodology part for an example of such a graph). 


König designed his method, which he refers to as a “solution plan”, to compare 
written solutions of proof tasks, be it different solutions of the same task or solutions 
of different tasks. The standardized way of depicting an argumentation allows for a 
mostly objective analysis of students’ work in different states of elaboration. 


Our research intention is to adapt the solution plan sensu König to our study and to 
apply it to the written argumentations (the products) of students who worked on 
mathematical problems and proof tasks. A secondary research question deals with 
detecting differences between and improvements of the argumentative competence of 
the students who underwent our training compared to those from the control group. 


DESIGN OF THE STUDY 


The HeuRekAP¹ study was launched at the beginning of the 2011/2012 school term 
(August 2011) in a German secondary school and lasted for one and a half years (until 
the end of January 2013, see Figure 1). It covered the whole eighth grade consisting of 
four parallel classes. Altogether there were 128 students initially aged 14 to 15. Two of 
these classes were continuously taught by the first author (treatment groups T1 and T2), 
the two others served as control groups (C1, C2). Treatment group T1 and control group 
C1 were both mathematical profile classes, which implies an additional mathematics 
lesson per week in grades seven and nine. 


¹ Heuristisch Rekonstruiertes Arbeiten und Problemlösen means Heuristically Reconstructed 
Working and Problem Solving; for details of the concept of Heuristical Reconstruction see Gawlick 
(2013). 


Figure 1: Overview of the study and the assessments relevant for this paper 


For this paper, 15 students from each of the profile classes (T1 and C1) were 
chosen by two criteria: (a) The selected students from both classes were supposed to be 
comparable with each other with respect to their initial performance (parallelized 
samples). This was measured by the average school marks in Mathematics and German 
over the four years before the study started. (b) The selected students should have 
had an above-average motivation to participate in the assessments of the study. 
To this end, a survey on motivation was conducted at the beginning of the study. 


The mathematics lessons of treatment group T1 included distinct phases in 
which structured argumentation and proving as well as the use of heuristics were 
trained. The students were involved in the whole process of proving according to 
Boero (1999) and learned to write down their proofs in the Two-Column-Format (cf. 
Herbst 2002). Among the heuristics they became familiar with are the use of 
auxiliary elements, principles like analogy and strategies like working backwards. See 
Brockmann-Behnsen (2013) for an example of a typical educational unit. 


At regular intervals, sets of reasoning problems were given to the students. 
Relevant for this paper are two items of the pretest, which was handed out before the 
treatment started, and three items of the posttest. The problems Rhombus 1, given in 
the pretest, and Rhombus 2, given in the posttest, are similar; Angle was given to the 
students both in the pretest and the posttest. With these pairs of problems the 
development of the quality of argumentation can be examined. Additionally, K10 was 
given to the students at the end of the study, with no matching pretest item because it 
is more complex than the other problems (see Table 1 for the problems). 
Rhombus 1 (Pretest) 
A rhombus is divided into two triangles by its diagonal. Demonstrate that these two 
triangles are congruent. Write down all your considerations and arguments step by step. 
Source: Griesel et al. 2006, p. 27 f. 

Rhombus 2 (Posttest) 
A rhombus is defined as a quadrilateral with four sides of equal length. 
Given: A rhombus with opposite interior angles α and β. 
Prove: |α| = |β| 
Source: Beuthan 2008, p. 53 

Angle (Pretest/Posttest) 
Determine the value of angle α. Write down all your considerations. 
Source: Lergenmüller et al. 2006, p. 64. 

K10 
AB is the diameter of a semicircle k, C is an arbitrary point on the semicircle (other than A or 
B) and M is the center of the circle inscribed into △ABC. Determine the value of ∠AMB. 
Source: TIMSS III² 

Table 1: The four tasks selected for the analyses in this paper 
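
For orientation, one possible solution of K10 can be written out as a chain of elementary deductions. This is a sketch added for illustration and not necessarily the standard solution used in the study; the symbols α, β and γ for the interior angles of △ABC at A, B and C are assumptions.

```latex
\begin{align*}
|\gamma| &= 90^\circ
  && \text{(Thales' Theorem, since $AB$ is a diameter of $k$)}\\
|\alpha| + |\beta| &= 180^\circ - |\gamma| = 90^\circ
  && \text{(Angle Sum Theorem in $\triangle ABC$)}\\
|\angle MAB| + |\angle ABM| &= \tfrac{1}{2}\bigl(|\alpha| + |\beta|\bigr) = 45^\circ
  && \text{($M$ is the incenter, so $AM$ and $BM$ bisect $\alpha$ and $\beta$)}\\
|\angle AMB| &= 180^\circ - 45^\circ = 135^\circ
  && \text{(Angle Sum Theorem in $\triangle ABM$)}
\end{align*}
```

Each line names the operator that is applied and the premises it is applied to, so the result does not depend on the position of C on the semicircle.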


METHODOLOGY 


The research questions stated in this paper demand an instrument which is suitable to 
analyze and categorize the quality of argumentation in the students’ products. These 
products often differ strongly in their form and structure. The spectrum ranges from 
disjointedly noted statements, partly written in mathematical symbols, through prose 
texts to highly structured Two-Column notations. 


Therefore, in a first step it is necessary to transform this variety of forms into one 
standardized format to facilitate comparability of the products. Oriented multigraph 
representations sensu König (1992, p. 25) serve as a basis for this standardized format. 
The vertices of these multigraphs consist of the given magnitudes framed by circles, 
operators like Thales’ Theorem (TT) or the Angle Sum Theorem (AST) framed by 
rhombuses, intermediate aims surrounded by a mixture of rectangles and circles, and 
the target magnitude enclosed in a rectangle. 
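
The vertex types described above could be encoded in a small data structure. The following is a minimal sketch under stated assumptions: the class and field names (`Vertex`, `Multigraph`, `kind`) are hypothetical, and the example vertices are only loosely based on problem K10, not taken from the standard solution of the study.

```python
from dataclasses import dataclass, field

# Vertex kinds taken from the description in the text; the string names are hypothetical.
KINDS = {"given", "operator", "intermediate", "target"}


@dataclass(frozen=True)
class Vertex:
    label: str
    kind: str

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown vertex kind: {self.kind!r}")


@dataclass
class Multigraph:
    vertices: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # a list, so parallel edges are allowed

    def add_edge(self, src: Vertex, dst: Vertex) -> None:
        """Add a directed edge and register both end points as vertices."""
        self.vertices.update((src, dst))
        self.edges.append((src, dst))

    def successors(self, vertex: Vertex) -> list:
        """Return the vertices reached directly from `vertex`."""
        return [dst for src, dst in self.edges if src == vertex]


# Illustrative fragment loosely based on K10 (not the study's standard solution).
diameter = Vertex("AB is a diameter of k", "given")
thales = Vertex("TT", "operator")
right_angle = Vertex("|γ| = 90°", "intermediate")

graph = Multigraph()
graph.add_edge(diameter, thales)
graph.add_edge(thales, right_angle)
print(graph.successors(thales))
```

Storing the edges in a list rather than a set is a design choice that keeps parallel edges, which is what distinguishes a multigraph from a simple directed graph.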


² In contrast to the TIMSS III format, no solution alternatives were given to the students in this study. 
Figure 2: T1-04-K10 original notations and standardized representation 


The oriented multigraph representation gives an overview of a complete solution of the 
given problem and highlights all the details reached by the student and their relation to 
each other. Figure 2 gives an example of such a representation. Shown are the original 
notations of student T1-04, who worked on problem K10. The notations have been 
parsed into units that correspond to intermediate aims, identified operators or phrases 
that indicate connections between them. Beneath the original notations the corresponding 
multigraph representation can be seen. The units of the original notations have been 
registered within this standard solution. 


In a second step the quality of the students’ argumentations was graded into six 
categories (Cat. 0 to Cat. 5) based upon the multigraph representation (see Table 2). 


The notations of student T1-04 on K10, as shown in Figure 2, consist of some intermediate 
aims and a logical connection between the operator Thales’ Theorem (TT) and its 
conclusion |γ| = 90°. The required premises for the application of that operator are not 
stated. Therefore these notations were categorized as Cat. 2 (Molecules). 
Category | Description | Graph 
Cat. 0 | No access: No detail of the students’ notations is relevant for the solution. | Blank 
Cat. 1 | Atoms: At least one detail of the students’ notations (operator, intermediate aim etc.) is relevant for the solution. | [figure] 
Cat. 2 | Molecules: At least two details of the students’ notations are logically connected with each other. | [figure] 
Cat. 3 | Deductive Cells: At least one correct and complete elementary deduction that is relevant for the solution can be found. This is called a Deductive Cell. It includes the notation of the premises required for the application of an operator, the operator itself and the correct conclusion derived by the application of that operator on the stated premises. | 
Cat. 4 | Deductive Torso: At least two deductions relevant for the solution are logically connected with each other. Either one of the connected deductions or the connection itself is correct and complete (existence of at least one Deductive Cell). | 
Cat. 5 | Deductive Body: A complete solution without any logical deficits is given. | Complete solution graph 


Table 2: Categories for grading the students’ products 
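
The grading rule of Table 2 can be read as a cascade that checks the strongest criterion first. The sketch below is a simplified illustration, not the coding instrument of the study: the record `CodedProduct` and its fields are hypothetical, and the judgments they hold (relevance, correctness, connectedness) are assumed to have been made by a human rater beforehand.

```python
from dataclasses import dataclass


@dataclass
class CodedProduct:
    """Hypothetical summary of one rated student product."""
    relevant_details: int        # number of details relevant for the solution
    connected_details: bool      # at least two details are logically connected
    deductive_cells: int         # correct and complete elementary deductions
    connected_deductions: bool   # at least two deductions are logically connected
    complete: bool               # complete solution without logical deficits


def category(product: CodedProduct) -> int:
    """Map a coded product to Cat. 0 to Cat. 5, strongest criterion first."""
    if product.complete:
        return 5  # Deductive Body
    if product.connected_deductions and product.deductive_cells >= 1:
        return 4  # Deductive Torso
    if product.deductive_cells >= 1:
        return 3  # Deductive Cells
    if product.connected_details:
        return 2  # Molecules
    if product.relevant_details >= 1:
        return 1  # Atoms
    return 0      # No access


# Product T1-04-K10 as described in the text: details are connected, but the
# premises of Thales' Theorem are missing, so there is no Deductive Cell.
# The count of three relevant details is an assumption made for this example.
t1_04_k10 = CodedProduct(
    relevant_details=3,
    connected_details=True,
    deductive_cells=0,
    connected_deductions=False,
    complete=False,
)
print(category(t1_04_k10))
```

Checking the criteria from Cat. 5 downwards means that each product receives the highest category whose condition it satisfies.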


RESULTS 


For all tasks presented in this paper as well as additional ones within this study, the 
coding of students’ written argumentations by representing them with an oriented 
multigraph and grading them into one of the six categories (Cat. 0 to Cat. 5) proved to be 
highly objective and reliable. Interrater agreement for 5 randomly selected students’ 
products per task has been calculated. The percentage agreement scores for 
researchers who coded the products individually range between 65% and 100%, 
with the median interrater agreement being 83%. 
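
A percentage agreement score of this kind can be computed as the share of products that two raters placed in the same category. The sketch below uses invented codes and hypothetical task names; how the study aggregated its scores across raters and tasks is not specified, so taking the median over tasks is an assumption.

```python
from statistics import median


def percent_agreement(rater_a, rater_b):
    """Percentage of products that both raters placed in the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty lists of codes")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical category codes (Cat. 0 to 5) of two raters, 5 products per task.
codes = {
    "task_1": ([2, 1, 4, 2, 3], [2, 1, 4, 2, 3]),
    "task_2": ([1, 2, 2, 0, 4], [1, 2, 3, 0, 4]),
    "task_3": ([2, 2, 1, 1, 5], [2, 3, 1, 0, 5]),
}
per_task = {task: percent_agreement(a, b) for task, (a, b) in codes.items()}
overall = median(per_task.values())
print(per_task, overall)
```

Percentage agreement does not correct for agreement by chance; a chance-corrected coefficient would be a stricter alternative.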


The coding of the students’ products into categories via the multigraph representations 
allows us to compare their argumentative performances. For this report, we examined a 
parallelized sample of 15 students each from the treatment group T1 and the control 
group C1. Because the category coding yields only ordinal data and 
because of the small sample size, in the following we use non-parametric statistical 
methods like interquartile ranges and chi-square tests instead of parametric methods 
like standard deviations and t-tests. 
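
For ordinal category codes, the median and the interquartile range can be obtained as in the following sketch. The data are invented, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption; the paper does not state which convention was used, and other conventions can give slightly different ranges.

```python
from statistics import median, quantiles


def median_and_iqr(categories):
    """Return the median and the interquartile range of ordinal category codes."""
    q1, _, q3 = quantiles(categories, n=4, method="inclusive")
    return median(categories), q3 - q1


# Hypothetical category codes (Cat. 0 to 5) of 15 students on one task.
codes = [1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4]
med, iqr = median_and_iqr(codes)
print(med, iqr)
```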


Comparing the two groups shows that they scored equally on both pretest items, as 
was expected because of the parallelization with regard to previous achievement. The 
three posttest items, however, show a significant difference in favor of the treatment 
group (see Table 3). This was confirmed by a chi-square test (χ² = 19.72, p < 0.0001). 
Group                             | Rhombus 1 | Rhombus 2 | Angle (pre) | Angle (post) | K10 
T1: median (interquartile range)  | 2 (1)     | 4 (1)     | 2 (3)       | 4 (0.5)      | 2 (0.5) 
C1: median (interquartile range)  | 2 (1)     | 1 (2)     | 2 (2)       | 2 (2)        | 1 (1) 


Table 3: Median results (interquartile ranges in parentheses) of the students for each task 
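
A chi-square test of the kind reported above compares observed category counts with the counts expected if group and category were independent. The sketch below uses an invented 2 x 2 table and does not reproduce the reported statistic; how the study arranged its contingency table is not stated, so the split into lower and higher categories is purely an assumption.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail probability of the chi-square distribution with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts of posttest products: rows = group, columns = (Cat. 0-2, Cat. 3-5).
table = [
    [10, 35],  # treatment group
    [30, 15],  # control group
]
statistic, dof = chi_square_statistic(table)
p_value = p_value_one_dof(statistic)
print(round(statistic, 2), dof, p_value)
```

The closed-form p-value used here is valid only for one degree of freedom; larger tables need the general chi-square distribution, for example from a statistics library.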


This result is supported by an analysis of the individual development of the 
students between the two matching pairs of pre-posttest items (Rhombus 1/2 and 
Angle pre/post). From the tasks of the pretest to the tasks of the posttest, only 5 out of 
30 products of the treatment group showed no change or even a decline in category, 
whereas 21 products rose by two or more categories. In the control group, 18 out 
of 30 products showed no change or even a decline in category from the pretest tasks to the 
posttest tasks and only 5 products rose by two or more categories. 
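
Counts of this kind follow from the difference between the posttest and the pretest category of each matched product. A minimal sketch with invented data; the function name and the dictionary keys are hypothetical:

```python
def summarize_changes(pre, post):
    """Count matched products by their change in category from pretest to posttest."""
    if len(pre) != len(post):
        raise ValueError("pretest and posttest lists must be matched")
    changes = [after - before for before, after in zip(pre, post)]
    return {
        "no_change_or_decline": sum(c <= 0 for c in changes),
        "gain_of_one": sum(c == 1 for c in changes),
        "gain_of_two_or_more": sum(c >= 2 for c in changes),
    }


# Hypothetical pretest and posttest categories of six matched products.
pre = [2, 1, 2, 0, 3, 2]
post = [4, 1, 3, 2, 2, 4]
summary = summarize_changes(pre, post)
print(summary)
```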


We illustrate the development of the students’ argumentative competence 
with the work of student T1-15 on the Angle problem in the 
pretest (A) and in the posttest (B). In the pretest the student merely states the correct 
result with the argument: “Denn: (Because:) 36°+21°=57°”. No mathematical 
connections between the given and the required angles are drawn. In the 
posttest the solution is structured in a two-column system, and heuristic elements 
such as auxiliary lines and notations can be found. 
DISCUSSION 


We introduced a study to foster the argumentative competencies of eighth graders. To 
examine such competencies and possible advancements, we developed a method based 
upon multigraph representations that enabled us to categorize and thereby compare 
written products of students working on mathematical problems and proof tasks. We 
tested the objectivity of this method by measuring its interrater reliability and 
obtained very satisfactory results. 


With the help of this method, we were able to grade the students’ argumentations 
before and after the 1.5-year period of our study. In accordance with the literature, 
most of the students achieved rather poor results in proof tasks prior to the study. The 
control group (with no special training in heuristics and argumentational strategies) 
showed equally poor results in the posttest. The treatment group, on the other hand, 
reached significantly better results after the training. Further research is needed to 
demonstrate the effectiveness of the teaching method elaborated in this study. 